Faster Game Solving via Predictive Blackwell Approachability: Connecting Regret Matching and Mirror Descent

Abstract

Blackwell approachability is a framework for reasoning about repeated games with vector-valued payoffs. We introduce predictive Blackwell approachability, where an estimate of the next payoff vector is given, and the decision maker tries to achieve better performance based on the accuracy of that estimator. In order to derive algorithms that achieve predictive Blackwell approachability, we start by showing a powerful connection between four well-known algorithms. Follow-the-regularized-leader (FTRL) and online mirror descent (OMD) are the most prevalent regret minimizers in online convex optimization. In spite of this prevalence, the regret matching (RM) and regret matching+ (RM+) algorithms have been preferred in the practice of solving large-scale games (as the local regret minimizers within the counterfactual regret minimization framework). We show that RM and RM+ are the algorithms that result from running FTRL and OMD, respectively, to select the halfspace to force at all times in the underlying Blackwell approachability game. By applying the predictive variants of FTRL or OMD to this connection, we obtain predictive Blackwell approachability algorithms, as well as predictive variants of RM and RM+. In experiments across 18 common zero-sum extensive-form benchmark games, we show that predictive RM+ coupled with counterfactual regret minimization converges vastly faster than the fastest prior algorithms (CFR+, DCFR, LCFR) across all games but two of the poker games, sometimes by two or more orders of magnitude.
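To make the RM+ update and its predictive variant concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' pseudocode: it works with utilities rather than losses, it uses the most recent instantaneous regret vector as the prediction for the next round, and it runs self-play on a small matrix game rather than inside counterfactual regret minimization. The names `RegretMatchingPlus`, `self_play`, `saddle_point_gap`, and the 2x2 payoff matrix `GAME` are hypothetical.

```python
def normalize(theta):
    """Scale a nonnegative vector onto the simplex; use uniform if it is all zeros."""
    total = sum(theta)
    n = len(theta)
    if total <= 0.0:
        return [1.0 / n] * n
    return [v / total for v in theta]


class RegretMatchingPlus:
    """RM+ with an optional prediction step (assumption: predict the last regret)."""

    def __init__(self, n_actions, predictive=False):
        self.z = [0.0] * n_actions            # thresholded cumulative regrets
        self.last_regret = [0.0] * n_actions  # prediction of the next regret vector
        self.predictive = predictive
        self.x = normalize(self.z)

    def next_strategy(self):
        if self.predictive:
            # Add the predicted regret before thresholding and normalizing.
            theta = [max(z + m, 0.0) for z, m in zip(self.z, self.last_regret)]
        else:
            theta = self.z  # already nonnegative by construction
        self.x = normalize(theta)
        return self.x

    def observe_utility(self, u):
        expected = sum(ui * xi for ui, xi in zip(u, self.x))
        regret = [ui - expected for ui in u]  # instantaneous regret per action
        self.z = [max(z + r, 0.0) for z, r in zip(self.z, regret)]
        self.last_regret = regret


def self_play(payoff, iterations, predictive=False):
    """Row player maximizes x^T A y, column player minimizes; returns average strategies."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    row = RegretMatchingPlus(n_rows, predictive)
    col = RegretMatchingPlus(n_cols, predictive)
    sum_x, sum_y = [0.0] * n_rows, [0.0] * n_cols
    for _ in range(iterations):
        x, y = row.next_strategy(), col.next_strategy()
        u_row = [sum(payoff[i][j] * y[j] for j in range(n_cols)) for i in range(n_rows)]
        u_col = [-sum(payoff[i][j] * x[i] for i in range(n_rows)) for j in range(n_cols)]
        row.observe_utility(u_row)
        col.observe_utility(u_col)
        sum_x = [s + xi for s, xi in zip(sum_x, x)]
        sum_y = [s + yi for s, yi in zip(sum_y, y)]
    return [s / iterations for s in sum_x], [s / iterations for s in sum_y]


def saddle_point_gap(payoff, x, y):
    """Best-response value for the row player minus best-response value for the column player."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    best_row = max(sum(payoff[i][j] * y[j] for j in range(n_cols)) for i in range(n_rows))
    best_col = min(sum(payoff[i][j] * x[i] for i in range(n_rows)) for j in range(n_cols))
    return best_row - best_col


# Hypothetical 2x2 zero-sum game used only for illustration.
GAME = [[2.0, -1.0], [-1.0, 1.0]]
```

Calling `self_play(GAME, 10000, predictive=True)` and passing the two returned average strategies to `saddle_point_gap` gives a rough measure of how far the averages are from equilibrium. Comparing against `predictive=False` on a toy game like this says little about the large extensive-form benchmarks described in the abstract.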

Similar resources

Blackwell Approachability and Low-Regret Learning are Equivalent

We consider the celebrated Blackwell Approachability Theorem for two-player games with vector payoffs. We show that Blackwell’s result is equivalent, via efficient reductions, to the existence of “no-regret” algorithms for Online Linear Optimization. Indeed, we show that any algorithm for one such problem can be efficiently converted into an algorithm for the other. We provide a useful applicati...

Blackwell Approachability and No-Regret Learning are Equivalent

We consider the celebrated Blackwell Approachability Theorem for two-player games with vector payoffs. Blackwell himself previously showed that the theorem implies the existence of a “no-regret” algorithm for a simple online learning problem. We show that this relationship is in fact much stronger, that Blackwell’s result is equivalent to, in a very strong sense, the problem of regret minimizati...

Blackwell Approachability and Minimax Theory

This manuscript investigates the relationship between Blackwell Approachability, a stochastic vector-valued repeated game, and minimax theory, a single-play scalar-valued scenario. First, it is established in a general setting — one not permitting invocation of minimax theory — that Blackwell’s Approachability Theorem (Blackwell [1]) and its generalization due to Hou [6] are still valid. Second...

Shifting Regret, Mirror Descent, and Matrices

We consider the problem of online prediction in changing environments. In this framework the performance of a predictor is evaluated as the loss relative to an arbitrarily changing predictor, whose individual components come from a base class of predictors. Typical results in the literature consider different base classes (experts, linear predictors on the simplex, etc.) separately. Introducing...

Constant Regret, Generalized Mixability, and Mirror Descent

We consider the setting of prediction with expert advice; a learner makes predictions by aggregating those of a group of experts. Under this setting, and with the right choice of loss function and “mixing” algorithm, it is possible for the learner to achieve constant regret regardless of the number of prediction rounds. For example, constant regret can be achieved with mixable losses using the ...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2021

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v35i6.16676